26 research outputs found

    Improving Word Association Measures in Repetitive Corpora with Context Similarity Weighting

    Peer reviewed

    Chapter 26 Language technology approach to “seeing” in Akkadian

    One way to understand the meanings of words is through their distributional properties. This methodology offers an interesting quantitative viewpoint on the lexicography of long-extinct languages. This chapter explores the use of Pointwise Mutual Information (PMI), a well-known statistical word association measure used in collocation analysis. PMI is applied to the data in order to gain insights into the semantic nuances of the Akkadian verbs of seeing (amāru, naṭālu, palāsu, dagālu, ḫiātu, barû, and subbû). To evaluate the data-driven results, the findings are compared to previous philological work by Ainsley Dicks. The analysis of the top-ranked PMI-extracted collocates provides a good overview of the typical semantic differences between the seven verbs of interest.
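    As an illustration of the measure named in this abstract, the following is a minimal sketch of PMI over a windowed co-occurrence count, PMI(a, b) = log2(p(a, b) / (p(a) p(b))). It is not the chapter's implementation: the function name `pmi_scores`, the window size, and the English toy corpus are assumptions made purely for demonstration.

```python
import math
from collections import Counter


def pmi_scores(sentences, window=2):
    """Return PMI for every unordered word pair co-occurring within `window` tokens.

    Hypothetical helper for illustration; probabilities are plain relative
    frequencies with no smoothing.
    """
    word_counts = Counter()
    pair_counts = Counter()
    for tokens in sentences:
        word_counts.update(tokens)
        for i, left in enumerate(tokens):
            # Pair each token with the next `window` tokens to its right.
            for right in tokens[i + 1 : i + 1 + window]:
                pair_counts[tuple(sorted((left, right)))] += 1

    total_words = sum(word_counts.values())
    total_pairs = sum(pair_counts.values())

    scores = {}
    for (a, b), n in pair_counts.items():
        p_pair = n / total_pairs
        p_a = word_counts[a] / total_words
        p_b = word_counts[b] / total_words
        scores[(a, b)] = math.log2(p_pair / (p_a * p_b))
    return scores


# Toy English corpus (an assumption; not data from the chapter).
toy_corpus = [
    ["king", "sees", "enemy"],
    ["king", "sees", "omen"],
    ["priest", "inspects", "omen"],
    ["king", "builds", "temple"],
]
scores = pmi_scores(toy_corpus)

# Print collocates from strongest to weakest association.
for pair, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(pair, round(value, 3))
```

    In this toy example the pair that occurs only once, between two words that occur only once, outranks the more frequent pair ("king", "sees"). This reflects the known tendency of unsmoothed PMI to favor rare events.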

    Kirjoitustaidon kehittyminen

    Peer reviewed

    Automated Phonological Transcription of Akkadian Cuneiform Text

    Peer reviewed

    BabyLemmatizer : A Lemmatizer and POS-tagger for Akkadian

    We present a hybrid lemmatizer and POS-tagger for Akkadian, the language of the ancient Assyrians and Babylonians, documented from 2350 BCE to 100 CE. In our approach, the text is first POS-tagged and lemmatized with TurkuNLP trained on human-verified labels, and then post-corrected with dictionary-based methods to improve the lemmatization quality. The post-correction also assigns confidence scores to the labels in order to flag the most suspicious lemmatizations for manual validation. We demonstrate that the presented tool achieves a Lemma+POS labeling accuracy of 94% and a lemmatization accuracy of 95% on a held-out test set.
    Peer reviewed

    Digital Approaches to Analyzing and Translating Emotion : What Is Love?

    This chapter discusses the use of digital tools – in particular, language technology – to study the history of emotions. A growing number of annotated text corpora for ancient languages are large enough to benefit from computational analysis. This chapter focuses on the cuneiform Akkadian texts available in the Open Richly Annotated Cuneiform Corpus (Oracc) and applies two language-technological methods, pointwise mutual information (PMI) and the fastText implementation of the continuous skip-gram model, to a dataset of 7,346 texts. To illustrate the potential of these methods, they are used to analyze the semantic domains of the verb râmu, “to love,” and its derivatives in Akkadian. Because the usage and semantic domains of a word can vary greatly between genres, the dataset is divided into several genres, and the analysis focuses on royal inscriptions, letters, and literary texts. The results show that, like the word love in English, râmu can denote different aspects of affection and love. It refers, for example, to erotic and sexual relationships between people, affection between family members, the king’s love of justice, and the gods’ pleasure with and acceptance of the king who fulfills divine expectations.
    Peer reviewed

    Semantic Domains in Akkadian Text

    The article examines the possibilities offered by language technology for analyzing semantic fields in Akkadian. The data used by our research group come from an existing electronic corpus, the Open Richly Annotated Cuneiform Corpus (ORACC). In addition to more traditional Assyriological methods, the article explores two language-technological methods: pointwise mutual information (PMI) and Word2vec.
    Peer reviewed

    Relatório de estágio em farmácia comunitária

    Internship report completed as part of the Integrated Master's in Pharmaceutical Sciences, submitted to the Faculty of Pharmacy of the University of Coimbra.